1 Introduction

Researchers have previous suggested that the response diversity of a community be measured by the diversity of responses to environmental change. For example, one can measure the response of each of the species’ intrinsic growth rate to temperature, quantify the strength and direction of these responses (e.g., as the first derivative of the response curve), and calculate the diversity of responses (e.g., by calculating variation in the first derivatives among the species in a community). When responses are nonlinear, the response diversity will be a function of the environmental state (i.e. the first derivative is a function of the value of the environmental state). So far we demonstrated this approach for quantifying response diversity in the context of a single environmental factor, but given that multiple environmental factors can change simultaneously, we need an approach that works in that context.

2 The principle

Owen learned about the mathematical principles from these youtube videos:

Imagine that the growth rate of a population depends on two environmental factors, e.g. temperature and salinity. We can represent the dependency as \(G = f(T, S)\), where \(G\) is growth rate, \(T\) is temperature, and \(S\) is salinity. It may be that the dependencies are linear, nonlinear, and with an interaction between temperature and salinity, hence our approach needs to be able to accommodate this phenomena.

The response of growth rate to change in temperature and salinity is the gradient / slope of this surface, with units of growth rate [per time] per temperature [degrees C] per salinity [parts per thousand]. Because the slope (first derivative) of the surface can (when dependencies are nonlinear) vary across the surface (location on the surface), and can vary in different directions on the surface, to calculate a slope we must specify the current environment (location on the surface) and the direction of change in the environment. The location on the curve is the current environmental condition, \((T_0, S_0)\), and the direction of environmental change is the unit vector \(\hat{u} = \langle U_T, U_S \rangle\).

Put another way, we calculate a directional derivative at a point on the response surface. We can write this as \(D_{\hat{u}}f(T_0, S_0)\) and can calculate it as \(f_T(T_0, S_0)U_T + f_S(T_0, S_0)U_S\), where \(f_T\) is the partial derivative of \(f(T, S)\) with respect to \(T\) and \(f_S\) is the partial derivative of \(f(T, S)\) with respect to \(S\).

Efficient evaluating in \(n\) dimensions can be done by taking the dot product of the partial derivatives at the location and the direction unit vector: \(D_{\hat{u}}f(T_0, S_0) = \triangledown f \cdot \hat{u}\) where, \(\triangledown f = \langle f_T, f_S \rangle\). (In R, the dot product of a and b is sum(a*b))

Figure 2.1 is an illustration of the principle of directional derivatives on a surface.

This figure needs considerable improvement! It is currently a keynote illustration.

Figure 2.1: This figure needs considerable improvement! It is currently a keynote illustration.

3 A simulated empirical example

Numerous mathematical functions have been used to represent how organismal performance changes with an environmental driver citation require. Moreover, multiple mathematical functions have been used to represent an interactive effect of two or more environmental drivers on species performance e.g. Thomas et al 2017. We have to decide if we are going to make simulations with different of these functions. This might increase our confidence that the method for calculating response diversity is robust to variation in types of response curve. If we decide not to, then we should likey argue that we think its robust.

3.1 Simulating performance curves

Let us use the Epply performance curve, which was used, for example, in this paper Bernhardt et al. 2018.

With one environmental variable, the performance (i.e., rate) is given by:

  • \(rate(E) = ae^{bE}(1 - (\frac{E - z}{w/2})^2)\)
  • \(E\) is the values of the environmental factor.
  • \(z\) controls location of maximum.
  • \(w\) controls range of \(E\) over which the rate is positive.
  • \(a\) scaling constant.
  • \(b\) controls rate of increase towards the maximum rate, as \(E\) increases.

Adding a second environmental variable gives:

\(rate(E_1, E_2) = a_1e^{b_1E_1}(1 - (\frac{E_1 - z_1}{w_1/2})^2) + a_2e^{b_2E_2}(1 - (\frac{E_2 - z_2}{w_2/2})^2)\)

In this case, it is clear the effect of \(E_1\) and \(E_2\) is defined as being additive. For example, the value of \(E_2\) does not affect the value of \(E_1\) at which the rate is maximised (\(z_1\)), and vice-versa (see also Figure 3.1)

Nonlinear and additive dependence of a rate on two environmental variables. (a) The value of \(E_1\) at which the rate is maximised is independent of the value of \(E_2\). (b) The value of \(E_2\) at which the rate is maximised is independent of the value of \(E_1\).

Figure 3.1: Nonlinear and additive dependence of a rate on two environmental variables. (a) The value of \(E_1\) at which the rate is maximised is independent of the value of \(E_2\). (b) The value of \(E_2\) at which the rate is maximised is independent of the value of \(E_1\).

Including an interaction. One way to do this is to make the value of \(E_1\) at which the rate is maximised depend on the value of \(E_2\):

\(rate(E_1, E_2) = a_1e^{b_1E_1}(1 - (\frac{(E_1 + z_{int21}*E_2- z_1)}{w_1/2})^2 + a_2e^{b_2E_2}(1 - (\frac{E_2 - z_2}{w_2/2})^2\)

When \(z_{int} = 0\) then this equation becomes the previously mentioned additive one. When \(z_{int} \neq 0\) then the value of \(E_1\) at which the rate is maximised is a function of the distance of \(E_2\) from its optimum value \(z_2\). In this particular equation, it is still the case that the value of \(E_2\) at which the rate is maximised is independent of the value of \(E_1\).

Nonlinear and additive dependence of a rate on two environmental variables. (a) The value of \(E_1\) at which the rate is maximised is independent of the value of \(E_2\). (b) The value of \(E_2\) at which the rate is maximised is independent of the value of \(E_1\).

Figure 3.2: Nonlinear and additive dependence of a rate on two environmental variables. (a) The value of \(E_1\) at which the rate is maximised is independent of the value of \(E_2\). (b) The value of \(E_2\) at which the rate is maximised is independent of the value of \(E_1\).

3.2 Simulating multiple species’ performance curves

3.2.1 No interacting environmental effects

First we create (or import) a table of parameter values of each species, with species in the rows and parameters in the columns. In the following example, only values of the \(z\) parameters differ among the species (which determine the location of the maximum rate).

For convenience we then convert the table of parameters into a list-column. We can then easily make performance curves of each of the species, and put those into a list-column in the same table.

## convert parameter table to a list-column of a tibble
par_list <- Partable_2_parlist(par_table)
## add performance curves
species_pars <- tibble(species = paste0("s", 1:s), pars = par_list) %>%
  rowwise() %>%
  mutate(expt = Make_expt(E1_series, E2_series, pars))

Here are some examples of the species’ performance curves (only with additive effects of \(E_1\) and \(E_2\)).

Performance curves for a species with maximum growth at low values of \(E_1\). Without interacting environmental effects.

Figure 3.3: Performance curves for a species with maximum growth at low values of \(E_1\). Without interacting environmental effects.

Performance curves for a species with maximum growth at high values of \(E_1\). Without interacting environmental effects.

Figure 3.4: Performance curves for a species with maximum growth at high values of \(E_1\). Without interacting environmental effects.

Performance curves for a species with maximum growth at low values of \(E_2\). Without interacting environmental effects.

Figure 3.5: Performance curves for a species with maximum growth at low values of \(E_2\). Without interacting environmental effects.

Performance curves for a species with maximum growth at high values of \(E_2\). Without interacting environmental effects.

Figure 3.6: Performance curves for a species with maximum growth at high values of \(E_2\). Without interacting environmental effects.

3.2.2 Interacting environmental effects

And now with interacting environmental effects…

For convenience we then convert the table of parameters into a list-column. We can then easily make performance curves of each of the species, and put those into a list-column in the same table.

## convert parameter table to a list-column of a tibble
par_list <- Partable_2_parlist(par_table)
## add performance curves
species_pars <- tibble(species = paste0("s", 1:s), pars = par_list) %>%
  rowwise() %>%
  mutate(expt = Make_expt(E1_series, E2_series, pars))

Here are some examples of the species’ performance curves (with interacting effects of \(E_1\) and \(E_2\)).

Performance curves for a species with maximum growth at low values of \(E_1\). With interacting environmental effects.

Figure 3.7: Performance curves for a species with maximum growth at low values of \(E_1\). With interacting environmental effects.

Performance curves for a species with maximum growth at high values of \(E_1\). With interacting environmental effects.

Figure 3.8: Performance curves for a species with maximum growth at high values of \(E_1\). With interacting environmental effects.

Performance curves for a species with maximum growth at low values of \(E_2\). With interacting environmental effects.

Figure 3.9: Performance curves for a species with maximum growth at low values of \(E_2\). With interacting environmental effects.

Performance curves for a species with maximum growth at high values of \(E_2\). With interacting environmental effects.

Figure 3.10: Performance curves for a species with maximum growth at high values of \(E_2\). With interacting environmental effects.

3.3 Fitting GAMs to noisy rate observations

Try with and without an interaction. Therefore make two species, one with no interaction z_int = 0 and the other with z_int = 0.1. All other parameters are the same. Note that noise is added to the rate observations.

GAM notes:

  • Owen watched this youtube video Introduction to Generalized Additive Models with R and mgcv by Gavin Simpson (author of the gratia package that we’ve been using to calculate derivatives of GAMs.). By the way, there is a part of this video about Model Selection, about Confidence Intervals, and about p-values for smooths (about 1h:57m to 2h:26) which I think is less useful for our purposes.
  • In the video mentioned above at about 54m:30s, there is the recommendation to estimate the penalisation parameter (\(\lambda\)) via restricted maximum likelihood, which is done by giving the the argument method = "REML in the gam() call (the default in mgcv version 1.8-36 is method="GCV.Cp").
  • Default smoother is a low rank thin plate spline (bs = "tp")
  • `s(E1, E2) will fit a bivariate isotropic thin plate spline. Isoptropic means there is a single smoothness parameter for the smooth. It is sensitive to the scale of \(E1\) and \(E2\).
  • Tensor products (te(E1, E2) have separate marginal basis, and therefore separate smoothness parameters. They are invariant to scales of \(E1\) and \(E2\). Therefore we should be using tensor products.
  • Specifying a model with te(E1, E2) does not then allow inspection of the main effects and interaction term–there is essentially only one smoother which contains the main effects and interaction. So it is difficult to know if the interaction term is required. An alternative is to specify the model as s(E1) + s(E2) + ti(E1, E2), in which the third term is then only the interaction part and allows a decomposition into the different effects. Useful for examining if the interaction is required.
  • To use cubic splines use bs = "cr".
  • There are many smoothers in mgcv
  • Choosing k (the basis complexity, which is the maximum wiggliness) is a bit of an art. The penalty then shrinks the used basis complexity, reducing the effective degrees of freedom. Must check that large enough k is provided; use gam.check(). Only cost to large \(k\) is computational effort.

Bottom line is that the gam picks up an interaction when we have included in the parameters used to generation the rates, and does not pick one up when we have not made an interaction. This confirms that our more mechanistic thinking and methods are matching our statistical thinking and methods, and confirms that each are promising, so far.

3.3.1 Without interaction

Performance curves for a species without interacting environmental effects and with some noise in the rate.

Figure 3.11: Performance curves for a species without interacting environmental effects and with some noise in the rate.

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## rate ~ ti(E1) + ti(E2) + ti(E1, E2)
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.0154285  0.0004991   30.91   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##             edf Ref.df         F p-value    
## ti(E1)    3.999  4.000 14145.143  <2e-16 ***
## ti(E2)    3.937  3.997   609.233  <2e-16 ***
## ti(E1,E2) 1.001  1.002     0.023   0.883    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.958   Deviance explained = 95.8%
## -REML = -5815.8  Scale est. = 0.000648  n = 2601

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## rate ~ s(E1) + s(E2)
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.0154285  0.0003952   39.04   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##         edf Ref.df       F p-value    
## s(E1) 8.970  9.000 10199.8  <2e-16 ***
## s(E2) 7.146  8.187   475.4  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.974   Deviance explained = 97.4%
## -REML = -6407.3  Scale est. = 0.00040624  n = 2601

3.3.2 With interaction

Performance curves for a species with interacting environmental effects and with some noise in the rate.

Figure 3.12: Performance curves for a species with interacting environmental effects and with some noise in the rate.

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## rate ~ ti(E1) + ti(E2) + ti(E1, E2)
## 
## Parametric coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0172349  0.0005248  -32.84   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##             edf Ref.df       F p-value    
## ti(E1)    3.999  4.000 21844.5  <2e-16 ***
## ti(E2)    3.936  3.997  1542.6  <2e-16 ***
## ti(E1,E2) 3.908  4.001   554.6  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.974   Deviance explained = 97.4%
## -REML =  -5679  Scale est. = 0.00071638  n = 2601

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## rate ~ s(E1) + s(E2)
## 
## Parametric coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0172349  0.0006269  -27.49   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##         edf Ref.df      F p-value    
## s(E1) 8.853  8.994 6893.8  <2e-16 ***
## s(E2) 5.303  6.429  671.3  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.962   Deviance explained = 96.2%
## GCV = 0.0010282  Scale est. = 0.0010222  n = 2601

3.4 Getting the directional derivative

First step is estimate the two partial derivatives \(f_{E1}(E1_0, E2_0)\) and \(f_{E2}(E1_0, E2_0)\) (please review the section The principle if necessary). This is not currently working. Please look at and perhaps run the R code for more details.

3.5 To do

  • The mathematical formulation of the interaction term is not very good. Too much changes when the \(z_int\) term is changed.
  • Collect data on response of growth rate to temperature and salinity for a collection of species. [We will simulate this, including response surfaces with combinations of linear and nonlinear responses, and without and with interactions among temperature and salinity.]
  • Fit a response surface to that data. [Do with GAMs with interactions.]
  • Specify the species present in a community and the temperature and salinity environment it occurs in. [We will simulate this, with different directions of change in temperature and salinity, e.g. independent, positively correlated, negatively correlated.]
  • Show examples of response diversity for the simulated environmental change scenarios and community compositions. [ Make function to calculate directional derivative at location, and calculate diversity of this across the species present in a specific community.]

4 Issues

  • The scales of the environmental change factors are different, and so some standardisation will be needed.

5 Left-overs

Also imagine that we are studying an ecosystem that can experience simultaneous change in temperature and salinity. For example, at a particular time, the temperature might be increasing by 0.1 C per day and salinity increasing by 0.2 parts per thousand per day.

Here some text that Sam wrote for a previous ms, that we removed from there, though please confirm this before using it in here The framework proposed herein allows derivation of environment-dependent performance responses. Yet, when species are exposed to multiple environmental stressors simultaneously—a situation common in empirical systems (Bowler et al. 2020)—it is not trivial to derive performance responses. Conceptually, the principles of measuring response diversity should still apply in multiple dimensions, but the analytical formulation must necessarily differ. If multiple environmental stressors act additively, then it should be mathematically straightforward to produce a multivariate response surface with performance varying as a function of two (or more) environmental conditions. However, where multiple stressors interact, nonadditivity adds additional layers of complication to the formulation of multivariate response surfaces (e.g., Yang et al. 2022). In such cases, a different analytical approach to estimating multivariate response diversity is needed. Future work should extend the univariate response diversity principles we suggest here into a multivariate framework for measuring response diversity to multiple environmental stressors. In doing so, multivariate response diversity may link conceptually to the study of co-tolerance to multiple stressors and stress-induced community sensitivity (see Vinebrooke et al. 2004) and should prove more operationalizable than univariate approaches given the propensity for multiple environmental stressors to not only co-occur but to interact nonadditively in nature (Dieleman et al. 2012).